1.  INTRODUCTION

It is well known that the extent of whitecap cover on the surface of a
sea is greatly influenced by the surface windspeed (Monahan (1971), Toba and
Chaen (1973), Wu (1979), Monahan and O'Muircheartaigh (1980)). Other
variables, such as sea surface temperature, are also important, but
windspeed appears to play the dominant role. Whitecap cover can be
remotely sensed while windspeed cannot, so it is tempting to use the
relationship between windspeed and whitecaps to infer reasonable values for
the surface windspeed. To do so requires that the natural causative
relation "whitecaps ← windspeed", quantitatively estimated from field data
as a statistical regression of (some measure of) whitecap coverage on
windspeed, be reversed. It turns out that the "natural" way of solving the
problem, namely regressing whitecap cover on windspeed and then
inverting that regression relation, actually produces results that are
inferior to those from some other procedures. Since the indirect remote
sensing of windspeed is of operational interest, and since similar problems
may well arise in other areas of remote sensing and elsewhere, we present
illustrative statistical analyses of several sets of whitecaps-windspeed
data in this paper. We also include, in later sections of the paper,
similar analyses based on simulated data.

The general problem considered here is that of making inferences about
an unknown $p \times 1$ vector $X'$ from a single random observed $q \times 1$
response vector $Y'$. The relationship between $Y$ and $X$ is calibrated with
experimental data $(Y_i, X_i)$, $i = 1, 2, \ldots, n$, where $Y_i$ and $X_i$ are
$q \times 1$ and $p \times 1$ vectors, respectively. The case $p = q = 1$ has
been extensively discussed in the literature, and reference will be made
below to several basic contributions to calibration methods for this case.
The situation when at least one of $p$, $q$ is greater than one is the subject
of a comprehensive paper by Brown (1982).

Brown  (1982)  distinguishes  two  cases  of  interest:  (a)  when  both  X  and 
Y  are  random  and  (b)  when  only  Y  is  random,  and  X  can  be  fixed  at  prechosen 
levels.  The  former  case  is  called  random  calibration,  and  the  latter 
controlled  calibration.  The  present  paper  is  concerned  solely  with  the 
problem  of  random  calibration,  because  the  data  of  interest  arises  in  an 
observational  context,  and  not  from  a  controlled  experiment. 

A brief outline of the paper is as follows. In Section 2 we describe
several different plausible methods of point estimation in univariate
calibration. The methods described are subsequently applied to four data
sets, and their performance evaluated, in Section 3. In Section 4 we
consider four interval estimates associated with the calibration problem,
and apply them to the data sets. The problem of multivariate calibration is
examined in Section 5. Several of the univariate methods are extended to
this situation and applied to the same four data sets, and to a further set
provided in Brown (1982). The application and an evaluation of the results
are presented in Section 6.

The later sections of the paper consider the same problems, but in the
context of a simulation study. Section 7 gives a brief description of the
objectives of the simulation study, Section 8 describes the point estimation
results, and Section 9 those related to interval estimation.

2.  THE UNIVARIATE PROBLEM

The simplest version of the calibration problem, and the one most
extensively discussed, is the case $p = q = 1$, where the calibration
curve is linear in both the parameters and the independent variable. The
situation of interest may therefore be described as follows: given two
random variables $X$, $Y$ with the relationship

$$Y = \alpha + \beta X + \varepsilon \tag{2.1}$$

where, most classically, $\varepsilon \sim N(0, \sigma^2)$, and given $n$
independent pairs of observations $(X_i, Y_i)$ on $(X, Y)$ and a new
observation $y_0$ on $Y$, how do we predict or estimate the corresponding
value $X = X(y_0)$? Numerous solutions have been proposed, and their
performances evaluated. Five of the resulting estimators, in particular,
have been applied in Section 3 to four data sets that relate whitecap cover
to surface windspeed. They arise from the following four methods:

(i) the so-called classical method, viz., estimate $\alpha$, $\beta$ in
equation (2.1) by least squares, and then for $Y = y_0$ the predicted value
of $X$ is

$$\hat{X}_C = \frac{y_0 - \hat{\alpha}}{\hat{\beta}}, \qquad \hat{\beta} \neq 0. \tag{2.2}$$


(ii) Krutchkoff (1967) suggested another estimator obtained by rewriting
(2.1) as

$$X = \gamma + \delta Y + \varepsilon \tag{2.3}$$

and obtaining least-squares estimators $\hat{\gamma}$, $\hat{\delta}$ of
$\gamma$, $\delta$; the predicted value of $X$, given $Y = y_0$, will then be

$$\hat{X}_I = \hat{\gamma} + \hat{\delta} y_0,$$

so denoted because it is known as the inverse estimator.

Krutchkoff (1967) concluded by means of a Monte Carlo study that
$\hat{X}_I$ had uniformly smaller mean-squared error (MSE) than the classical
estimator $\hat{X}_C$. In a later (1969) paper he concluded that this result
was valid only within the calibration range, whereas the reverse held
outside that range. Williams (1969) pointed out that for finite samples the
MSE of the classical estimator is infinite and that of the inverse estimator
finite, so that the use of the MSE for comparing these estimators is
unsatisfactory.

(iii) Lwin & Maritz (1980) proposed an alternative estimator based on
the fact that, for this particular problem, the predictor of $X_0$ given by

$$X^*(y_0) = E\{X \mid Y = y_0\} \tag{2.4}$$

has minimum mean-squared error (provided $\sigma$ and $\alpha$, $\beta$ are
all known). By using consistent estimators $\hat{\alpha}$, $\hat{\sigma}$ and
$\hat{\beta}$ of $\alpha$, $\sigma$ and $\beta$, and by approximating the
marginal distribution of $X$ with the corresponding empirical distribution
function, Lwin & Maritz showed that the estimator

$$\hat{X}_E(y_0) = \frac{\sum_{i=1}^{n} x_i\, f\{(y_0 - \hat{\alpha} - \hat{\beta} x_i)/\hat{\sigma}\}}{\sum_{i=1}^{n} f\{(y_0 - \hat{\alpha} - \hat{\beta} x_i)/\hat{\sigma}\}} \tag{2.5}$$

will, subject to easily satisfied regularity conditions, tend to the optimal
estimator $X^*(y_0)$ in mean square, where $f$ is the error density function
(presumed known; otherwise estimated).

(iv) A Bayesian methodology was introduced by Aitchison & Dunsmore
(1975). This method involves the assumptions that

(a) $X$, $Y$ are Normal,

(b) $Y \sim N(\alpha + \beta x, \sigma^2)$.

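From these assumptions, it can be shown that the predictive distribution for
$X_0$, given $n$ pairs of observations $(X_i, Y_i)$ and a single new
observation $y_0$, is proportional to

$$\mathrm{St}\Big\{n-1,\ \bar{x},\ \big(1+\tfrac{1}{n}\big)\frac{\sum (x_i-\bar{x})^2}{n-1}\Big\} \cdot \mathrm{St}\Big\{n-2,\ m,\ \big(1+\tfrac{1}{k}\big)v\Big\} \tag{2.6}$$

where

$$\frac{1}{k} = \frac{1}{n} + \frac{(x-\bar{x})^2}{\sum (x_i-\bar{x})^2}, \qquad m = \hat{\alpha} + \hat{\beta} x, \qquad v = \frac{\sum (y_i - \hat{\alpha} - \hat{\beta} x_i)^2}{n-2}$$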


and St{k,b,c} is the usual non-central Student's t-distribution with density
function given by

$$f(u; c, b, k) = \frac{1}{B(\tfrac{1}{2}, \tfrac{k}{2})\,(kc)^{1/2}} \left\{1 + (kc)^{-1}(u-b)^2\right\}^{-(k+1)/2} \tag{2.7}$$


The constant of proportionality in (2.6) must be obtained by numerical
integration. The predictive distribution (2.6) enables us to obtain
either point or interval estimates of $X_0$. The point estimates examined are
(a) the mean of the predictive distribution, $\hat{X}_{ME}$, and (b) the mode
of the predictive distribution, $\hat{X}_{MO}$.

We have, therefore, five different estimators to be compared:

(i) the classical predictor $\hat{X}_C$

(ii) the inverse predictor $\hat{X}_I$

(iii) the empirical predictor $\hat{X}_E$

(iv) the mean of the predictive distribution $\hat{X}_{ME}$

(v) the mode of the predictive distribution $\hat{X}_{MO}$
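
As a rough illustration of the first three estimators, the following Python
sketch computes the classical, inverse and empirical (Lwin & Maritz type)
predictors for univariate data. It is a minimal sketch under stated
assumptions, not the code used in this paper: the function names and the demo
data are hypothetical, the error density f is assumed to be Normal, and sigma
is estimated by the residual standard deviation with divisor n - 2.

```python
import math


def ols(u, v):
    """Least-squares fit v = a + b*u; returns (a, b, residual_sd)."""
    n = len(u)
    ubar = sum(u) / n
    vbar = sum(v) / n
    suu = sum((ui - ubar) ** 2 for ui in u)
    suv = sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))
    b = suv / suu
    a = vbar - b * ubar
    rss = sum((vi - a - b * ui) ** 2 for ui, vi in zip(u, v))
    return a, b, math.sqrt(rss / (n - 2))


def classical(x, y, y0):
    """Regress Y on X, then invert the fitted line."""
    a, b, _ = ols(x, y)
    return (y0 - a) / b


def inverse(x, y, y0):
    """Regress X on Y directly and read off the prediction."""
    g, d, _ = ols(y, x)
    return g + d * y0


def empirical(x, y, y0):
    """Weighted mean of the observed x_i, weights from the error density.

    A Normal error density is assumed here; its normalising constant
    cancels in the ratio, so only the exponential kernel is needed.
    """
    a, b, s = ols(x, y)
    w = [math.exp(-0.5 * ((y0 - a - b * xi) / s) ** 2) for xi in x]
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


# Hypothetical demo data (not from any of the four data sets).
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_demo = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]

if __name__ == "__main__":
    for y0 in (2.0, 3.5, 6.0):
        print(y0, classical(x_demo, y_demo, y0),
              inverse(x_demo, y_demo, y0),
              empirical(x_demo, y_demo, y0))
```

Both regression lines pass through the sample means, so the classical and
inverse predictions agree at the mean of the observed y-values; away from it
the inverse prediction lies closer to the mean of the observed x-values,
because the product of the two fitted slopes is the squared sample
correlation.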

3.  COMPARISON  OF  UNIVARIATE  PREDICTORS 
3.1  The Data

The  five  predictors  were  compared  by  applying  them  to  four  data 
sets.  The  data  sets  consist  of  measurements  of  instantaneous  oceanic 
whitecap  coverage  (Y)  and  wind  speed  (X),  and  the  object  of  the  exercise  is 
the prediction of $X_0$ given a new observation $y_0$. An initial inspection of
the  data  suggested  lognormal  distributions  for  both  X  and  Y  and  a  log 
transformation  gave  an  acceptable  fit  to  a  Normal  distribution.  Data  points 
for  which  whitecap  coverage  was  0.0  were  excluded  from  the  analysis  for 
several  reasons,  but  particularly  because  it  seemed  reasonable  to  assume 
that  a  zero  whitecap  coverage  gave  no  additional  information  in  relation  to 
wind  speed  over  and  above  the  conditional  distribution  of  wind  speed  given 
zero  whitecap  coverage.  The  data  sets  involved  were  the  following: 

Data  set  1:  Monahan  (1971) 

Data  set  2:  Toba  &  Chaen  (1973) 

Data set 3: JASIN experiment (1973), (Monahan et al. (1981))

Data set 4: Strex experiment (1981), (Monahan et al. (1981))

The  number  of  (pairs  of)  non-zero  observations  in  the  respective  data  sets 
were  4j,  18,  37  and  78. 


3.2  Method  of  Comparison  of  Estimators 


For  each  data  set,  we  excluded  one  data  point  at  a  time  and 
obtained  each  of  the  five  estimators  based  on  the  remaining  data.  We  then 
predicted  the  x-value  of  the  excluded  point,  given  the  y-value  of  that 
point,  using  each  of  the  five  estimators.  This  provided  five  predicted  x- 
values  for  each  point  in  each  data  set.  Finally,  for  each  of  the  five 
estimators  and  for  each  data  set,  we  calculated  the  mean  bias  (MB)  and  the 
mean-squared  prediction  error  (MSPE)  defined  as  follows  for  a  given  data 
set: 


$$\mathrm{MB} = \sum_{i=1}^{n} (\hat{x}_i - x_i)/n \tag{3.1}$$

$$\mathrm{MSPE} = \sum_{i=1}^{n} (\hat{x}_i - x_i)^2/n \tag{3.2}$$

where n is the number of points in the data set.

3.3  Results

The results are presented in Tables 1 and 2.
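
The leave-one-out comparison of Section 3.2 can be sketched in Python as
follows. This is an illustrative sketch only: the helper names are
hypothetical, the inverse predictor stands in for any of the five estimators,
and the bias is taken as predicted minus observed x, which is an assumption
about the sign convention.

```python
def inverse_predictor(x, y, y0):
    """Inverse estimator: least-squares regression of x on y, evaluated at y0."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return xbar + (sxy / syy) * (y0 - ybar)


def loo_scores(x, y, predictor):
    """Leave-one-out mean bias (MB) and mean-squared prediction error (MSPE).

    x and y are lists of equal length; predictor(x_rest, y_rest, y0)
    returns the predicted x for a new response y0.
    """
    n = len(x)
    errors = []
    for i in range(n):
        # Fit on all points except the i-th, then predict the held-out x.
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        x_hat = predictor(x_rest, y_rest, y[i])
        errors.append(x_hat - x[i])
    mb = sum(errors) / n
    mspe = sum(e * e for e in errors) / n
    return mb, mspe
```

Any function with the same `(x, y, y0)` signature can be passed as
`predictor`, so the same loop serves for each estimator in turn.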
TABLE 1

Bias of Estimators

                                     Estimator
Data Set   $\hat{X}_C$   $\hat{X}_I$   $\hat{X}_E$   $\hat{X}_{ME}$   $\hat{X}_{MO}$
   1          .0150         .0038        -.0039          .1969           -.0050
   2          .0119        -.0068        -.0141         -.1110           -.0092
   3          .0055         .0019        -.0151          .2831            .0080
   4         -.0014         .0004         .0030          .0415            .0006

Table 1 shows that, in terms of bias, the estimator $\hat{X}_{ME}$ (i.e., the
mean of the predictive distribution of x) is uniformly the worst of the five
estimators and the inverse estimator $\hat{X}_I$ almost uniformly the least
biased. The estimator $\hat{X}_{MO}$ is close to, but slightly worse than,
$\hat{X}_I$ in terms of bias. A two-way analysis of variance applied to the
data of Table 1 yielded the obvious results in terms of significance.


TABLE 8.2

MSE of Various Predictors
X: Normal    Error: t, 3 d.f.

rho-squared   Estimator     N = 20     N = 40     N = 80

    .1           E1           1.04       0.95       0.98
                 E2          95.13     141.63      19.67
                 E3           0.97       0.92       0.90
                 E4           1.00       0.95       0.97
                 E5           0.99       0.95       0.95
                 E6           0.96       0.88       0.85
                 E7           0.93       0.87       0.83
                 E8           1.03       0.95       0.98

    .5           E1           0.57       0.49       0.51
                 E2           1.32       1.02       1.03
                 E3           0.51       0.48       0.48
                 E4           0.56       0.49       0.51
                 E5           0.55       0.49       0.50
                 E6           0.50       0.47       0.44
                 E7           0.50       0.46       0.43
                 E8           0.56       0.49       0.51

    .9           E1           0.11       0.11       0.12
                 E2           0.12       0.12       0.13
                 E3           0.14       0.11       0.11
                 E4           0.11       0.11       0.12
                 E5           0.12       0.11       0.12
                 E6           0.16       0.12       0.11
                 E7           0.17       0.12       0.11
                 E8           0.11       0.11       0.12
TABLE 8.1

MSE of Various Predictors
X: Normal    Error: Normal

rho-squared   Estimator     N = 20     N = 40     N = 80

    .1           E1           1.02       0.95       0.92
                 E2         149.50    2063.04     174.14
                 E3           1.02       0.95       0.92
                 E4           1.01       0.95       0.92
                 E5           1.01       0.95       0.92
                 E6           1.07       1.00       0.96
                 E7           1.01       0.96       0.92
                 E8           1.01       0.95       0.92

    .5           E1           0.56       0.52       0.51
                 E2           1.51       1.21       1.04
                 E3           0.58       0.53       0.51
                 E4           0.56       0.52       0.51
                 E5           0.56       0.52       0.51
                 E6           0.63       0.58       0.56
                 E7           0.62       0.56       0.53
                 E8           0.56       0.52       0.51

    .9           E1           0.11       0.10       0.10
                 E2           0.13       0.12       0.11
                 E3           0.15       0.12       0.11
                 E4           0.11       0.10       0.10
                 E5           0.11       0.10       0.10
                 E6           0.18       0.13       0.13
                 E7           0.18       0.13       0.13
                 E8           0.11       0.10       0.10
The overall exercise was repeated for each of the following combinations of
assumptions regarding the forms of the distributions of X and Y:

1.  X: N(0, 1)                    Error: Normal, mean 0
2.  X: N(0, 1)                    Error: t, 3 d.f.
3.  X: N(0, 1)                    Error: Stretched Normal (Gaver (1982))
4.  X: t, 3 d.f., variance 1      Error: Normal, mean 0
5.  X: t, 3 d.f., variance 1      Error: t, 3 d.f.

The  error  variance  in  each  case  was  fixed  so  as  to  give  the  required 
correlation  between  X  and  Y. 
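
The way the error variance follows from the required correlation can be made
explicit. As a sketch, assume a linear relation $Y = \alpha + \beta X +
\varepsilon$ with the error independent of X, and write $\sigma_x^2$ for the
variance of X and $\sigma_\varepsilon^2$ for the error variance (this notation
is introduced here for illustration only):

```latex
\rho^2 = \frac{\beta^2 \sigma_x^2}{\beta^2 \sigma_x^2 + \sigma_\varepsilon^2}
\quad\Longrightarrow\quad
\sigma_\varepsilon^2 = \beta^2 \sigma_x^2 \, \frac{1 - \rho^2}{\rho^2}.

% With \beta = 1 and \sigma_x^2 = 1, as in the simulation study:
\rho^2 = .1 \;\Rightarrow\; \sigma_\varepsilon^2 = 9, \qquad
\rho^2 = .5 \;\Rightarrow\; \sigma_\varepsilon^2 = 1, \qquad
\rho^2 = .9 \;\Rightarrow\; \sigma_\varepsilon^2 = 1/9 \approx 0.11.
```

Under joint normality the conditional variance of X given Y is then
$1 - \rho^2$, which is the order of magnitude to expect for the mean-squared
error of a well-behaved predictor at each value of $\rho^2$.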

The  results  arising  from  each  series  of  assumptions  are  presented 
in  Tables  8.1  through  8.5,  respectively. 


estimator. Therefore, certainly for large samples, we would expect the
performance of the classical estimator, which in general is not good, to
improve as $\rho^2 \to 1$. Note further that (see Appendix A) the estimator E8
is virtually identical with E1 for any reasonably large sample size N, thereby
providing justification for the use of the estimator E1.
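
The closeness of E8 and E1 can be checked numerically. The sketch below plugs
sample estimates into the Normal-theory conditional mean of X given Y = y,
$[\beta(y - \alpha)\sigma_x^2 + \mu_x\sigma_y^2] / [\beta^2\sigma_x^2 +
\sigma_y^2]$, and compares the result with the inverse estimator. It is a
sketch under assumptions, not the definition of E8 used in the paper: the
function names are hypothetical, and both variances are estimated with divisor
n, a choice under which the two predictors coincide algebraically. With other
divisors they differ by terms that shrink as the sample size grows.

```python
def sums(x, y):
    """Sample size, means and centred sums of squares and products."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return n, xbar, ybar, sxx, syy, sxy


def e8_plugin(x, y, y0):
    """Plug-in version of the Normal-theory conditional mean of X given y0."""
    n, xbar, ybar, sxx, syy, sxy = sums(x, y)
    beta = sxy / sxx
    alpha = ybar - beta * xbar
    var_x = sxx / n                       # divisor n: an assumption
    var_e = (syy - sxy ** 2 / sxx) / n    # residual variance, divisor n
    num = beta * (y0 - alpha) * var_x + xbar * var_e
    return num / (beta ** 2 * var_x + var_e)


def e1_inverse(x, y, y0):
    """Inverse estimator: regression of x on y evaluated at y0."""
    n, xbar, ybar, sxx, syy, sxy = sums(x, y)
    return xbar + (sxy / syy) * (y0 - ybar)
```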

8.2  Simulation 

The criterion of comparison of the different estimators is their
mean-squared error of prediction. The basic assumption is that we have two
random variables X, Y such that

$$E\{y \mid x\} = \alpha + \beta x, \qquad V(y \mid x) = \sigma_y^2 \tag{8.3}$$

The study involves a number of different assumptions concerning the form of
the distributions of X and Y | X, and these are detailed below. The (true)
values of $\alpha$ and $\beta$ are taken to be 0 and 1, respectively. An
initial random sample of size N is generated, from which the predictive
relation is derived. Then 100 further pairs of observations are simulated from
the same true model, and a prediction of the x-value corresponding to each
y-value is made using each of the eight estimators.

The above exercise was carried out 2000 times for every
combination of the following parameters:

Sample size N:                     20, 40, 80

Squared correlation coefficient:   .1, .5, .9
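
The simulation design described above can be sketched in a few lines of
Python. This is a scaled-down illustration, not the original program: it
covers only the Normal/Normal case and only the inverse (E1) and classical
(E2) predictors, the function names and the seed are hypothetical, and the
default number of replications is reduced from 2000 to keep the run short.

```python
import math
import random


def fit(u, v):
    """Least-squares intercept and slope for v = a + b*u."""
    n = len(u)
    ubar = sum(u) / n
    vbar = sum(v) / n
    b = (sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))
         / sum((ui - ubar) ** 2 for ui in u))
    return vbar - b * ubar, b


def simulate(n, rho2, reps=200, n_new=100, seed=1):
    """Return (MSE of inverse, MSE of classical) for X ~ N(0,1), Normal error."""
    rng = random.Random(seed)
    # alpha = 0, beta = 1, Var(X) = 1: error sd chosen to give the required rho2.
    sd_err = math.sqrt((1.0 - rho2) / rho2)

    def draw(m):
        xs = [rng.gauss(0.0, 1.0) for _ in range(m)]
        ys = [xv + rng.gauss(0.0, sd_err) for xv in xs]
        return xs, ys

    se_inv = 0.0
    se_cls = 0.0
    for _ in range(reps):
        x, y = draw(n)            # calibration sample
        a, b = fit(x, y)          # Y on X, inverted below (classical)
        g, d = fit(y, x)          # X on Y (inverse)
        x_new, y_new = draw(n_new)
        for x0, y0 in zip(x_new, y_new):
            se_inv += (g + d * y0 - x0) ** 2
            se_cls += ((y0 - a) / b - x0) ** 2
    total = reps * n_new
    return se_inv / total, se_cls / total


if __name__ == "__main__":
    for rho2 in (0.1, 0.5, 0.9):
        print(rho2, simulate(20, rho2))
```

Because the classical predictor divides by an estimated slope that can fall
close to zero, its simulated mean-squared error is unstable from run to run
when the correlation is weak, which is consistent with the erratic E2 entries
in the tables.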


8.1  The Point Estimators

The estimators being compared are the five referred to above, with
the following additions:

(a) Two alternative versions of estimator E3 [the Empirical Bayes
estimator] are developed, viz.,

E6: assuming the errors follow a Student t-distribution, and estimating its
variance in the standard manner, and

E7: as in (i), except that we use a maximum likelihood estimate of the
variance of the t-distribution.

(b) A further alternative version of estimator E3 is derived by assuming
that the distributions of X and Y | X are $N(\mu_x, \sigma_x^2)$ and
$N(\alpha + \beta X, \sigma_y^2)$, respectively. Then, by straightforward
probability calculus we have

$$f(x \mid y) = N\!\left(\frac{\beta (y - \alpha)\sigma_x^2 + \mu_x \sigma_y^2}{\beta^2 \sigma_x^2 + \sigma_y^2},\ \frac{\sigma_x^2 \sigma_y^2}{\beta^2 \sigma_x^2 + \sigma_y^2}\right) \tag{8.1}$$


Hence  an  "empirical"  Bayes  estimator  of  X  given  Y  is 




E1   (i)     the inverse
E2   (ii)    the classical
E3   (iii)   estimated empirical Bayes
E4   (iv)    mean of predictive distribution
E5   (v)     mode of predictive distribution

together  with  corresponding  interval  estimators,  each  of  which  is  defined  in 
Section  3.  The  general  conclusion  drawn  was  that,  with  the  exception  of 
estimator  (ii),  which  was  considerably  inferior,  all  the  other  estimators 
are  broadly  comparable  in  terms  of  predictive  performance.  This  conclusion 
is  supported  by  the  results  of  several  previous  studies. 

In  this  section  we  further  evaluate  the  performance  of  these  estimators 
by  computer  simulation.  We  concentrate  in  particular  on  the  robustness  of 
the  estimators,  and  on  the  effect  of  sample  size  on  the  predictive  ability 
of  the  estimators.  The  classical  assumption  is  that  both  variables  in  the 
calibration  study  have  normal  distributions;  this  is  the  first  situation  we 
have  studied.  We  have  subsequently  allowed  for  non-normal  distributions  for 
each  variable  in  turn,  and  for  both  together.  Another  factor  which  has 
emerged  as  being  of  importance  in  determining  the  relative  and  absolute 
merits  of  the  different  estimators  is  the  (true)  correlation  between  the  two 
variables,  and  the  effect  of  this  factor  has  also  been  examined. 

This  section  is  divided  into  two  parts;  the  first,  Section  8,  is 
concerned  with  the  point  estimators,  and  the  second,  Section  9,  with 
interval  estimators. 


8.  COMPARISON  OF  POINT  ESTIMATORS 


analysis  by  us  for  15  other  random  samples  of  size  5  yielded  an  average  of 
just  under  98%  of  variation  explained.) 

Another  interesting  outcome  of  this  analysis  is  the  relatively  poor 
performance  of  the  method  E  for  this  data  set.  Our  results  confirm  those  of 
Brown,  and  in  fact  indicate  that  E  is  worse  than  in  his  analysis. 
Incidentally, an examination of the $w_i$ (weights) involved in method E
reveals  that  when  we  go  to  the  multivariate  case  we  are  dealing  with 
extremely  small  numbers  (<<  exp(-30))  and  for  this  reason  the  method  may  be 
very  susceptible  to  differences  in  computational  precision  in  this  case. 
The  method  held  up  well  for  the  wind/whitecap  multivariate  extension  (which 
involved  the  inclusion  of  additional  X's)  but  has  not  performed  well  in  this 
case  with  the  inclusion  of  additional  Y's.  This  may  be  because  the 
inclusion  of  additional  Y's  increases  the  dimension  of  the  regression 
density  function,  whereas  the  inclusion  of  additional  X's  does  not. 
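
One common way to reduce sensitivity to very small density values when
forming normalised weights is to work with logarithms and subtract the
largest log-density before exponentiating. The sketch below illustrates the
idea; it is not the procedure used in this paper, the function names are
hypothetical, and it assumes the log-densities are available directly.

```python
import math


def stable_weights(log_f):
    """Weights w_i = f_i / sum_j f_j computed from log f_i.

    Subtracting the maximum leaves the ratios unchanged but keeps the
    largest term equal to exp(0) = 1, so the sum cannot underflow to zero.
    """
    m = max(log_f)
    shifted = [math.exp(v - m) for v in log_f]
    total = sum(shifted)
    return [s / total for s in shifted]


def naive_weights(log_f):
    """Direct computation; fails if every density underflows to 0.0."""
    f = [math.exp(v) for v in log_f]
    total = sum(f)
    return [v / total for v in f]
```

For moderately small densities the two functions agree; the difference only
appears when the densities are small enough to underflow in floating point.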

In  fact,  in  view  of  the  results  presented  in  later  sections,  a  number 
of  aspects  of  the  analysis  of  this  data  set  are  not  at  all  surprising. 
Firstly,  since  the  data  indicate  a  very  strong  underlying  correlation,  it  is 
to  be  expected  that  the  classical  estimator  will  perform  well.  Secondly, 
for  the  same  reason,  we  can  expect  the  Lwin  &  Maritz  type  estimator  to 
perform  poorly. 

7.  THE  SIMULATION  STUDY 

In  Section  1,  we  evaluated  the  performance  of  a  number  of  point  and 
interval  estimators  of  wind  speed  given  whitecap  coverage  when  applied  to 
each  of  four  data  sets.  The  five  estimators  involved  were: 
predictive capacity over the "best" univariate predictor $\hat{X}_I$
($\hat{X}_{LB}$). Among the truly multivariate of these methods
[$\hat{X}_L$, $\hat{X}_E$], the empirical $\hat{X}_E$ holds up extremely
well, whereas the classical multivariate again is uniformly the worst.

In  Table  4  we  present  the  results  for  the  Brown  data: 

TABLE 4

Mean Squared Prediction Error

                            Method
Variable       L        L'        E        E'       LB
   X1        .003     .003      .031     .017     .003
   X2        .041     .041      .298     .076     .041

A  comparison  of  the  columns  of  Table  4  confirms  the  result  of  Brown  (1982) 
that  the  methods  L,  LB  are  virtually  indistinguishable  in  terms  of 
predictive  performance  for  this  data  set.  This  is  at  variance  with  all 
previous  univariate  results,  and  with  the  multivariate  conclusions  for  the 
wind/whitecap data. As pointed out in Brown (1982), these results should be
treated  with  some  caution,  as  the  data  are  perhaps  atypical  in  that  such  a 
large  percent  of  the  variation  is  explained  by  the  model.  (Brown  predicted 
the  x-values  of  5  points  using  the  remaining  16,  and  found  for  methods  L,  LB 
in  all  cases  over  98%  of  variation  explained  by  the  model.  A  similar 
We consider the results for the Brown data and the Wind/Whitecap data
separately. In Table 3 we present the results for the data of Section 3.


TABLE 3

Mean Squared Prediction Error

                                      Predictor
Data Set   $\hat{X}_L$   $\hat{X}_{L'}$   $\hat{X}_{LB}$   $\hat{X}_E$   $\hat{X}_{E'}$   $\hat{X}_{LB*}$
   1          .095           .192             .059            .059           .056             .040
   2          .550           .205             .082            .079           .086             .110
   3          .114           .103             .066            .061           .060             .072
   4          .148           .149             .061            .062           .062             .056

Before comparing these predictors, a number of points should be noted.

(i) $\hat{X}_{L'}$ is simply the classical estimator when only the wind and
whitecap variables are taken into account, so that this is identical with the
estimator $\hat{X}_C$ of Section 2.

(ii) By definition, $\hat{X}_{LB}$ predicts each component of X separately,
and hence this also is the univariate $\hat{X}_I$ (since Y has only one
component here).

(iii) Included in column 6 of Table 3 is the predictor $\hat{X}_{LB*}$,
obtained simply by regressing the wind variable on all other variables in the
analysis [whether X or Y].

A  comparison  of  the  columns  of  Table  3  reveals  that  none  of  the 
multivariate  methods  used  leads  to  any  noticeable  improvement  in  terms  of 


where $X_i$ is the ith observation on X and

$$w_i = f(Y' \mid X_i) \Big/ \sum_{j=1}^{n} f(Y' \mid X_j) \tag{5.6}$$

In the case of our analysis, f was assumed to be the multivariate normal
regression density (Mardia et al. (1979), ch. 6) with parameters fixed at
their least squares values. It is, of course, also possible to obtain the
estimator $\hat{X}_E$ for the problem of predicting each component of X
separately, given $Y'$. This estimator is denoted by $\hat{X}_{E'}$, following
Brown (1982).

The  above  5  predictors  were  applied  to  five  data  sets,  constituted  as 
follows: 

(a) the four data sets of Section 2, each augmented by the inclusion of
additional x-variables, viz., surface water temperature and air temperature
[i.e., q = 1, p = 3].

(b) the data set provided in Brown (1982), Section 4, relating four
infrared reflectance responses of wheat (Y) to determination of percent
water, $X_1$, and percent protein, $X_2$, of the wheat [i.e., q = 4, p = 2].

The various predictors were compared using the same criteria as in
Section 2, viz., the mean-squared prediction error where one point at a time
is omitted, and then the x-value for that point is predicted using all the
remaining points for estimation purposes.


6.  ANALYSIS  OF  RESULTS  FOR  MULTIVARIATE  CASE 


where $y_0$ is the newly observed single value of Y which we are to use to
predict X, and $\hat{B}$, $S$ are the usual least squares estimators of $B$,
$\Sigma$ (Mardia et al., 1979). Note that if we replace $\hat{B}$, $S$ by
their univariate counterparts, and put $\alpha = 0$ (following centering of
the data), equation (5.3) does, as expected, reduce to equation (2.2).
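
The reduction can be checked directly. As a sketch, assume (5.3) has the form
$\hat{X}_L = (\hat{B} S^{-1} \hat{B}^{T})^{-1} \hat{B} S^{-1} y_0$, take
p = q = 1, and write $\hat{\beta}$ for the scalar version of $\hat{B}$ and
$s^2$ for the scalar version of $S$ (symbols introduced here for illustration):

```latex
\hat{X}_L
  = (\hat{\beta}\, s^{-2}\, \hat{\beta})^{-1}\, \hat{\beta}\, s^{-2}\, y_0
  = \frac{s^{2}}{\hat{\beta}^{2}} \cdot \frac{\hat{\beta}}{s^{2}}\, y_0
  = \frac{y_0}{\hat{\beta}}
```

This is the classical predictor $(y_0 - \hat{\alpha})/\hat{\beta}$ with
$\hat{\alpha} = 0$. The residual variance cancels, so in the univariate case
the weighting by $S^{-1}$ has no effect on the point prediction.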

The analysis which produces $\hat{X}_L$ here performs a multivariate
regression of Y on X. Brown (1982) suggests an alternative predictor
$\hat{X}_{L'}$, where in attempting to predict a component of X (say $X_j$)
we regress Y on $X_j$ alone, and obtain $\hat{X}_{L'}$ by a formula analogous
to (5.3).

II.  From multivariate regression of X on Y (denoted LB in Brown (1982)):

$$\hat{X}_{LB}^{T} = y_0^{T} (Y^{T} Y)^{-1} Y^{T} X \tag{5.4}$$

Note that in this case, each component of X is predicted ignoring all the
other components of X; in effect we carry out a multiple (not a
multivariate) regression of each component of X on Y.

III.  A generalization of the empirical method of Lwin & Maritz (denoted E
in Brown (1982)). The extension is straightforward. Like (L) it uses
the parametric regression of Y on X and derives that of X on Y by means of
the empirical distribution of X. Specifically, if $Y'$ is a new
($q \times 1$) observation on Y, the prediction for the corresponding $X'$
($p \times 1$) is

$$\hat{X}_E = \sum_{i=1}^{n} w_i X_i \tag{5.5}$$
5.  THE MULTIVARIATE CASE

Brown (1982) has studied the case p, q not both 1, in some depth.
While  most  of  his  analysis  relates  to  the  case  of  controlled  calibration 
(i.e.,  X  not  random),  he  does  devote  some  attention  to  the  random 
calibration  situation.  The  model  employed  is  a  generalization  of  (2.1), 
viz.,

$$Y = 1\alpha^{T} + XB + E \tag{5.1}$$

where $Y$ ($n \times q$), $E$ ($n \times q$) and $X$ ($n \times p$) are random
matrices, and $E$ is a disturbance matrix from $N_q(0, \Sigma)$. If units of X
and Y are chosen so that the variables are post hoc centered at zero, we can,
without loss of generality, rewrite equation (5.1) so that the constant term
disappears, and hence we have

$$Y = XB + E \tag{5.2}$$

Brown (1982) suggests three estimators for the multivariate situation.
These are analogous to the predictors $\hat{X}_C$, $\hat{X}_I$ and
$\hat{X}_E$ of Section 2 and are derived as follows:

I.  From regression of Y on X (denoted L by Brown), and analogous to
$\hat{X}_C$:

$$\hat{X}_L = (\hat{B} S^{-1} \hat{B}^{T})^{-1} \hat{B} S^{-1} y_0 \tag{5.3}$$
I1.  The standard 95% confidence interval based on the inverse
regression, i.e., regression of U on W.

I2.  An interval based on the Lwin & Maritz estimator, and using the
standard deviation of the closely related estimator E8, derived in
Section 8.

I3.  An interval based on the classical estimator $\hat{X}_C$, and described
in Brownlee (1965).

The results are presented below:

TABLE 3

Confidence Intervals

                  I1                    I2                    I3
Data Set   % cov   Av length     % cov   Av length     % cov   Av length
   1        97.7      .90         97.7      .89         96.3     1.72
   2        96.6     1.04         94.6     1.01         95.6     2.23
   3        89.2      .96         94.6      .95         90.3     1.84
   4        93.5      .96         92.3      .96         94.2     1.62

In general, intervals I1 and I2 are very comparable. The analysis was
performed,  as  in  the  case  of  point  estimation,  by  omitting  each  point  in 
turn  and  constructing  a  confidence  interval  based  on  an  analysis  of  all  the 
remaining  points  of  the  data  set. 


TABLE 2

Mean-Squared Prediction Error

                                     Estimator
Data Set   $\hat{X}_C$   $\hat{X}_I$   $\hat{X}_E$   $\hat{X}_{ME}$   $\hat{X}_{MO}$
   1          .192          .059          .056           .060             .060
   2          .205          .082          .086           .082             .083
   3          .103          .066          .060           .067             .067
   4          .149          .061          .062           .061             .061

This  table  shows  that  in  terms  of  average  squared  prediction  error,  the 
classical  predictor  is  once  again  uniformly  the  poorest,  having  mean 
prediction  error  in  the  range  2  to  3  times  that  of  any  of  the  other 
estimators.  The  remaining  four  estimators  are  very  close  in  terms  of 
predictive  capacity  for  those  data  sets,  with  none  uniformly  better  than  the 
others.  Once  again,  a  two-way  ANOVA  yielded  the  expected  results. 

One advantage of the Aitchison and Dunsmore method is that it
produces, in addition to the point predictions, the predictive distribution
of x given $y = y_0$. From this it is possible to obtain shortest 100(α)%
confidence intervals for x given $y = y_0$.

4.  INTERVAL  ESTIMATION  FOR  THE  WIND/WHITECAP  DATA 

For  each  predicted  value,  and  for  each  data  set,  the  following  95% 
confidence  intervals  have  been  constructed. 
TABLE 8.3

MSE of Various Predictors
X: Normal    Error: Stretched Normal

rho-squared   Estimator     N = 20     N = 40     N
                 E1           1.00       0.97     0
                 E2        3230.46      66.92     14

TABLE 8.4

MSE of Various Predictors
X: t, 3 d.f.    Error: Normal

rho-squared   Estimator     N = 20     N = 40

    .1           E1           0.99       0.95
                 E2        2919.72       0.99
                 E3           0.92       0.93
                 E4           0.98       0.95
                 E5           0.98       0.95
                 E6           0.96       1.00
                 E7           0.92       0.95
                 E8           0.98       0.95

    .5           E1           0.58       0.55
                 E2           1.82       1.14
                 E3           0.61       0.58
                 E4           0.59       0.55
                 E5           0.59       0.57
                 E6           0.72       0.70
                 E7           0.69       0.69
                 E8           0.59       0.55

    .9           E1           0.11       0.10
                 E2           0.13       0.11
                 E3           0.21       0.18
                 E4           0.11       0.11
                 E5           0.11       0.12
                 E6           0.35       0.33
                 E7           0.35       0.34
                 E8           0.12       0.11


TABLE 8.5

MSE of Various Predictors
X: t, 3 d.f.    Y: t, 3d

rho-squared   Estimator     N = 20     N
                 E1           1.09       0.87     87
                 E2        1652.77       1.00     84


8.3  Discussion of Simulation Results

The criterion used for comparison of estimators, the mean-squared
error, is, of course, scale dependent, and therefore the only meaningful
comparison between estimators is the percentage difference in mean squared
error.

Looking first at Table 8.1 (X and the error both Normal), we see
that E1, E4, E5 and E8 are virtually indistinguishable in terms of
predictive performance. The Lwin & Maritz type procedures (E3, E6, and E7)
are somewhat inferior, particularly for small sample size and/or large
$\rho^2$. For example, for N = 20, $\rho^2$ = .9, the appropriate Lwin &
Maritz estimator (E3) is approximately 36% worse than the four "good"
estimators in terms of mean squared error. The classical estimator (E2) turns
out to be just as bad as might be expected from previous studies, although it
does, as we predicted it should, appear to improve as $\rho^2$ increases.
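
The percentage comparison can be illustrated with the entries of Table 8.1
for N = 20 and $\rho^2$ = .9, where E3 has mean squared error 0.15 and E1, E4,
E5 and E8 each have 0.11. As a worked check, using the tabulated values as
rounded:

```latex
100 \times \frac{0.15 - 0.11}{0.11} \approx 36\%
```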

In Table 8.2 (X Normal, error having a t-distribution), the four
estimators E1, E4, E5 and E8 are again essentially identical in their
performance. The classical estimator is again poor, with the same proviso
as above. However, the Lwin & Maritz type estimators (E3, E6, E7) now
perform very well, except for a combination of small N and large $\rho^2$.
The superiority of the most appropriate (and best) of these estimators (E7)
is of the order of 10 to 15 percent reduction in mean squared error for
$\rho^2$ in the range .1 to .5, although for larger $\rho^2$ and small sample
sizes this difference is smaller, and in one case is reversed (N = 20,
$\rho^2$ = .9). This is a general pattern that has emerged: the Lwin & Maritz
type estimators do not perform well for large $\rho^2$, particularly when the
corresponding sample sizes are small.

In Table 8.3 (X still Normal, the error having even longer, more straggling tails), the pattern is very similar to that of Table 8.2. Once again the estimators E1, E4, E5, E8 are broadly comparable. E2 is poor, and estimators E3, E6 and E7 are, with the type of exception mentioned above, superior (involving a reduction of up to about 12% in mean squared error).

In the remaining tables, we allow X to be non-Normal. In the case of Table 8.4 (X, t distribution, error Normal), estimators E1, E4, E5, and E8 are still virtually identical. E2 is still the worst, but E3 (which one would expect, given its definition, to be good in this case) is superior only for small ρ², and this superiority is most marked for small N. For moderate ρ² (.5), E3 is marginally worse than E1, E4, E5, and E8, and for large ρ², E3 is distinctly inferior. E6 and E7 are, in general, as might be expected in this case, very poor in their predictive capacity.

Finally, in Table 8.5 we have the case where neither X nor the error is Normally distributed. The pattern of Table 8.4 continues here, except that the cases where E3 is superior are even more limited, and the inferiority of E6 and E7 for large ρ² even more pronounced.

Some general conclusions can now be drawn from the combined results:

1. For all ρ², and all N, regardless of underlying distributions, the estimators E1, E4, E5, and E8 are indistinguishable in terms of predictive performance, when that performance is measured in terms of MSE.

2. When both X and the error are Normal, one of the estimators E1, E4, E5, and E8 should be used. The Lwin & Maritz type estimator can be inferior in this case, particularly for large ρ² and small N.

3. When X is Normal, but the error is not, the LM estimators can be superior, except when there is a combination of high ρ² and small N. Modifying the LM estimator to take account of the form of the error distribution (E6, E7) does lead to further reduction of the mean-squared error.

4. When X is long-tailed non-Normal, the range of superiority of the LM estimators is very limited: in fact it only occurs for small ρ², and is most marked for small N. Calibration is probably not a very appropriate technique in that situation. Therefore, when X is non-Normal, one should probably utilize one of the estimators E1, E4, E5 or E8.

5. The estimator E8, which has not been studied before, performs very well in general. The inverse estimator E1 performs equally well, but estimator E8 has some appealing properties, viz.

(i) it can be derived directly from our assumptions (7.3);

(ii) it leads to a simple and reliable confidence interval (cf. Section 8);

(iii) simple algebra (Appendix A) will show that E8 is almost identical with E1, thus providing justification for use of E1.

6. Estimators E4 and E5 also perform well, but are computationally more difficult to obtain, and do not yield easily computable confidence intervals. However, they do give a readily computable predictive distribution.

7. Sample size is not a major factor in the absolute size of the mean-squared error. Reading across any of the rows of any of the Tables 8.1 through 8.5, we see relatively little reduction in MSE as we go from N = 20 to N = 40 to N = 80. The reduction is certainly small compared to the reduction as we go from ρ² = .1 to ρ² = .5 to ρ² = .9. This is not, of course, very surprising: it merely indicates that the main determining factor in the predictive capacity of the various calibration estimators is the strength of the actual (linear) relationship between the relevant variables. Nevertheless, certain ways of processing the data can have considerable advantages.
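
The comparison of E1, E2 and E8 summarized in the conclusions above can be sketched in a few lines of Python. This is an illustrative sketch, not the code used for the study. E8 is coded from the Appendix formula as far as it can be read, and the data-generating choices (intercept 0, slope 1, X standard Normal, Normal error with variance (1 - ρ²)/ρ²) as well as all function names are assumptions made here for illustration.

```python
import random


def fit_sums(xs, ys):
    """Means and corrected sums of squares and products for the calibration sample."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return n, xbar, ybar, sxx, syy, sxy


def e1_inverse(xs, ys, y0):
    """Inverse estimator: regress X on Y and predict at y0."""
    n, xbar, ybar, sxx, syy, sxy = fit_sums(xs, ys)
    return xbar + (sxy / syy) * (y0 - ybar)


def e2_classical(xs, ys, y0):
    """Classical estimator: regress Y on X and invert the fitted line."""
    n, xbar, ybar, sxx, syy, sxy = fit_sums(xs, ys)
    return xbar + (y0 - ybar) / (sxy / sxx)


def e8_empirical_bayes(xs, ys, y0):
    """E8 as read from the Appendix (variance estimates use n-1 and n-2)."""
    n, xbar, ybar, sxx, syy, sxy = fit_sums(xs, ys)
    beta = sxy / sxx
    alpha = ybar - beta * xbar
    var_x = sxx / (n - 1)
    var_e = (syy - beta ** 2 * sxx) / (n - 2)
    return (var_x * beta * (y0 - alpha) + xbar * var_e) / (beta ** 2 * var_x + var_e)


def simulate_mse(rho2, n, reps=200, n_new=50, seed=1):
    """Monte Carlo MSE of E1, E2, E8 under the assumed Normal/Normal design."""
    rng = random.Random(seed)
    sd_e = ((1.0 - rho2) / rho2) ** 0.5  # yields squared correlation rho2
    estimators = {"E1": e1_inverse, "E2": e2_classical, "E8": e8_empirical_bayes}
    sq_err = {name: 0.0 for name in estimators}
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [x + rng.gauss(0.0, sd_e) for x in xs]
        for _ in range(n_new):
            x0 = rng.gauss(0.0, 1.0)
            y0 = x0 + rng.gauss(0.0, sd_e)
            for name, est in estimators.items():
                sq_err[name] += (est(xs, ys, y0) - x0) ** 2
    return {name: total / (reps * n_new) for name, total in sq_err.items()}


if __name__ == "__main__":
    print(simulate_mse(rho2=0.5, n=20))
```

With far fewer replications than the study, the numbers will not reproduce the tables, but the qualitative pattern (E1 and E8 nearly equal, E2 clearly worse) should be visible.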

9. COMPARISON OF INTERVAL ESTIMATORS

9.1 The Interval Estimators

Although  numerous  point  estimators  have  been  derived  and  studied 
in  connection  with  the  calibration  problem,  the  study  of  the  interval 
estimation  problem  has  been  much  less  extensive.  In  this  section  we 
examine,  again  by  means  of  simulation,  the  performance  of  a  number  of 
interval  estimators.  These  estimators  are  as  follows. 

1. For the point estimator E1, we use the standard 95% confidence interval for the predicted value of X, given y = y_0, viz.

    E1 \pm \left( 1 + \frac{1}{n} + \frac{(\bar{y} - y_0)^2}{S_{yy}} \right)^{1/2} \times s \times t_{0.025}

2. Brownlee (1965) has suggested a 95% confidence interval related to the approach of point estimator E2. This interval, referred to in the following tables as the classical interval, has the disadvantage that it fails to exist in certain circumstances. Its performance is examined.

3.  An  empirical  Bayes-type  confidence  interval  based  on  the  derivation  of 
E8  is  given  by 


    E8 ± t_{0.025} [...]


and  described  herein  as  empirical  Bayes  type  1. 

4.  Lwin  &  Maritz  have  an  alternative  suggested  procedure  for  deriving  an 
empirical  Bayes  confidence  interval.  Three  different  intervals  of  this  type 
are  calculated,  viz., 

(i)  an  interval  based  on  assuming  a  normal  distribution  for  the  error 
term,  and  denoted  by  type  2; 

(ii) an interval based on assuming a t-distribution for the error, and estimating its variance by the standard method (a type 3 empirical Bayes interval);

(iii) similar to (ii), except that a maximum likelihood approach is used to estimate the variance. This we call a type 4 interval.

All these intervals have the property (see Lwin & Maritz (1980)) that they can be semi-infinite. As such intervals make the calculation of average interval length impossible, and as their occurrence is rare, we omit them from our calculations, and merely record their frequency of occurrence.

5. It is possible to construct confidence intervals based on the predictive distribution of Aitchison & Dunsmore, but since this involves, for a single y-value, repeated numerical integration it is omitted from the simulation study.
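
As an illustration of the first interval in this list (the one attached to the inverse estimator E1), the following minimal sketch computes it from a calibration sample. It assumes that s is the residual standard deviation from the regression of X on Y with n - 2 degrees of freedom, which this section does not spell out, and it takes the t critical value as an argument rather than computing it. The function name and the example data are hypothetical.

```python
import math


def inverse_interval(xs, ys, y0, t_crit):
    """Interval E1 +/- t * s * sqrt(1 + 1/n + (ybar - y0)^2 / Syy).

    Assumption: s^2 is the residual mean square of the regression of X on Y,
    i.e. (Sxx - Sxy^2 / Syy) / (n - 2).
    """
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / syy                      # slope of X regressed on Y
    e1 = xbar + slope * (y0 - ybar)        # inverse point estimator E1
    s2 = (sxx - slope * sxy) / (n - 2)     # residual variance of X given Y
    half = t_crit * math.sqrt(s2 * (1.0 + 1.0 / n + (ybar - y0) ** 2 / syy))
    return e1 - half, e1 + half


if __name__ == "__main__":
    # Hypothetical calibration data; 3.182 is t_{0.025} with 3 d.f.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(inverse_interval(xs, ys, y0=6.0, t_crit=3.182))
```

Unlike the classical interval, this one always exists and is always finite, since its half-width is a square root of a non-negative quantity.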

9.2  Design  of  the  Simulation  Study 

The design of the simulation study is identical with that in Section 7, except that, due to the omission of the (computationally lengthy) Aitchison & Dunsmore estimators E4 and E5, we are able to greatly expand the number of replications at each setting of ρ², N. In fact we now repeat the experiment (of generating a sample, and 100 additional pairs of observations for prediction based on the sample) 2000 instead of 100 times. This means that for each ρ², N configuration, we are constructing 200,000 confidence intervals. The intervals so constructed are compared in terms of percent coverage and average length.

The  results  are  presented  in  Tables  9.1  through  9.5;  these  tables 
have  a  direct  correspondence  with  Tables  8.1  through  8.5. 
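
The structure of such a coverage experiment can be sketched as follows for the inverse interval alone. This is a scaled-down illustration (200 replications rather than 2000), not the authors' program. The data-generating choices (slope 1, intercept 0, X standard Normal, Normal error with variance (1 - ρ²)/ρ²), the definition of s as the residual standard deviation of X on Y, and all names are assumptions made here.

```python
import math
import random

T_CRIT_18DF = 2.101  # upper 2.5% point of Student's t with 18 d.f. (N = 20)


def fit_inverse(xs, ys):
    """Summary statistics needed for the inverse (X on Y) interval."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / syy
    s = math.sqrt((sxx - slope * sxy) / (n - 2))
    return n, xbar, ybar, syy, slope, s


def interval(fit, y0, t_crit):
    """Inverse interval for X at a new response y0."""
    n, xbar, ybar, syy, slope, s = fit
    centre = xbar + slope * (y0 - ybar)
    half = t_crit * s * math.sqrt(1.0 + 1.0 / n + (ybar - y0) ** 2 / syy)
    return centre - half, centre + half


def coverage_and_length(rho2, n=20, reps=200, n_new=100, seed=7):
    """Fraction of intervals covering the true X, and their average length."""
    rng = random.Random(seed)
    sd_e = math.sqrt((1.0 - rho2) / rho2)
    hits = 0
    total_length = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [x + rng.gauss(0.0, sd_e) for x in xs]
        fit = fit_inverse(xs, ys)
        for _ in range(n_new):  # additional pairs used only for prediction
            x0 = rng.gauss(0.0, 1.0)
            y0 = x0 + rng.gauss(0.0, sd_e)
            lo, hi = interval(fit, y0, T_CRIT_18DF)
            hits += lo <= x0 <= hi
            total_length += hi - lo
    count = reps * n_new
    return hits / count, total_length / count


if __name__ == "__main__":
    print(coverage_and_length(rho2=0.5))
```

With 200 replications of 100 predictions each, this produces 20,000 intervals per configuration; raising `reps` to 2000 gives the 200,000 described above.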




TABLE 9.1

Performance of Various Confidence Intervals
X: Normal   Error: Normal

Confidence Interval

                  Inverse        Emp. Bayes 1            Classical
ρ²     N       %       Av.       %       Av.       %       Av.      %
              Cov.   Length     Cov.   Length     Cov.   Length   Exist

.1    20     94.8     4.0      93.1     3.7      94.6    35.6     66.6
                                                                 (22.7)
      40     94.9     3.8      94.1     3.7      96.6    37.0     83.2
                                                                 (12.1)
      80     94.9     3.7      94.5     3.7      97.3    32.4     96.0
                                                                  (2.9)
.5    20     94.8     3.0      93.6     2.8      96.1     6.2     99.6
                                                                  (0.4)
      40     95.1     2.9      94.4     2.8      95.7     4.4    100.0
                                                                    (0)
      80     94.7     2.8      94.3     2.7      95.0     4.1    100.0
                                                                    (0)
.9    20     94.9     1.3      93.5     1.3      95.0     1.4    100.0
                                                                    (0)
      40     95.0     1.3      94.4     1.2      95.1     1.3    100.0
                                                                    (0)
      80     94.9     1.2      94.6     1.2      94.9     1.3    100.0
                                                                    (0)

TABLE 9.1 (continued)

                Emp. Bayes 2             Emp. Bayes 3             Emp. Bayes 4
ρ²     N      %     Av.      %         %     Av.      %         %     Av.      %
             Cov.  Length  Exist      Cov.  Length  Exist      Cov.  Length  Exist

.1    20    86.9    3.2    100.0     88.3    3.5    100.0     88.5    3.5    100.0
      40    90.6    3.5    100.0     92.2    3.7    100.0     92.4    3.8    100.0
      80    93.5    3.7    100.0     94.1    3.8    100.0     94.2    3.9    100.0
.5    20    82.7    2.4    100.0     78.2    2.2    100.0     87.2    2.5    100.0
      40    90.2    2.7    100.0     86.4    2.5    100.0     89.7    2.7    100.0
      80    92.6    2.7    100.0     89.0    2.5    100.0     92.3    2.8    100.0
.9    20    78.8    1.1     99.0     42.6    0.6     98.6     50.2    0.7     98.9
      40    86.3    1.1     99.8     45.5    0.5     99.4     55.5    0.6     99.6
      80    90.5    1.2     99.8     46.6    0.5     99.6     59.2    0.6     99.8


TABLE 9.2

Performance of Various Confidence Intervals
X: Normal   Error: t, 3 d.f.

Confidence Interval

                  Inverse        Emp. Bayes 1            Classical
ρ²     N       %       Av.       %       Av.       %       Av.      %
              Cov.   Length     Cov.   Length     Cov.   Length   Exist

.1    20     94.5     4.0      92.4     3.7      93.9    39.1     69.7
                                                                 (15.1)
      40     94.8     3.8      93.8     3.6      95.4    46.8
                                                                  (5.9)
      80     94.9     3.7      94.3     3.6      95.9    26.1     95.5
                                                                  (1.6)
.5    20     93.8     2.8      92.2     2.6      94.1     7.3      1.3
                                                                  (1.5)
      40     95.1     2.8      94.1     2.7      94.7     4.6     99.7
      80     95.5     2.8      95.1     2.8      95.2     4.1     99.9
.9    20     93.4     1.2      92.3     1.1      93.3     1.3     99.9
      40     94.6     1.2      94.1     1.2      94.5     1.3    100.0
      80     95.2     1.2      95.0     1.2      95.1     1.3    100.0
Table 9.2 (continued)

                Emp. Bayes 2             Emp. Bayes 3             Emp. Bayes 4
ρ²     N      %     Av.      %         %     Av.      %         %     Av.      %
             Cov.  Length  Exist      Cov.  Length  Exist      Cov.  Length  Exist

.1    20    85.9    3.1    100.0     87.2    3.2    100.0     87.5    3.3    100.0
      40    91.9    3.5    100.0     93.5    3.8    100.0     93.7    3.9    100.0
      80    93.9    3.6    100.0     94.5    3.8    100.0     94.6    3.8    100.0
.5    20    82.2    2.2     99.8     75.0    1.9    100.0     77.9    2.1    100.0
      40    88.3    2.4     99.9     84.2    2.1    100.0     86.3    2.2    100.0
      80    91.7    2.5     99.9     45.3    0.6     98.9     48.1    0.6     98.9
.9    20    75.9    1.0     98.6     45.3    0.6     98.9     48.1    0.6     98.9
      40    85.9    1.1     99.5     50.5    0.5     99.3     54.3    0.5     99.4
      80    90.5    1.1     99.9     51.5    0.4     99.7     55.7    0.4     99.7


TABLE 9.3

Performance of Various Confidence Intervals
X: Normal   Error: Stretched Normal

Confidence Interval

                  Inverse        Emp. Bayes 1            Classical
ρ²     N       %       Av.       %       Av.       %       Av.      %
              Cov.   Length     Cov.   Length     Cov.   Length   Exist

.1    20     94.5     4.0      92.6     3.7      94.2    58.7     67.5
                                                                 (17.6)
      40     94.7     3.8      93.8     3.7      95.3    36.7     18.6
                                                                  (8.5)
      80     94.9     3.7      94.4     3.6      96.0    26.7     95.2
                                                                  (2.0)
.5    20     93.8     2.9      92.2     2.7      94.1     6.3     99.1
                                                                   (.5)
      40     94.4     2.7      93.4     2.7      94.1     4.3     99.8
      80     94.9     2.7      94.6     2.7      94.8     4.1    100.0
.9    20     93.0     1.2      91.9     1.1      93.0     1.3    100.0
      40     93.9     1.2      93.5     1.2      93.8     1.3    100.0
      80     94.6     1.2      94.3     1.2      94.5     1.3    100.0
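Appendix


Given y = y_0, if we define E1 to be

    E1 = \alpha^* + \beta^* y_0

where

    \beta^* = \frac{S_{xy}}{S_{yy}} ,  and  \alpha^* = \bar{x} - \beta^* \bar{y} ,

using conventional notation, and E8 to be

    E8 = \frac{\hat\sigma_x^2 \hat\beta (y_0 - \hat\alpha) + \bar{x} \hat\sigma_y^2}{\hat\beta^2 \hat\sigma_x^2 + \hat\sigma_y^2}

where

    \hat\beta = \frac{S_{xy}}{S_{xx}} ,  \hat\alpha = \bar{y} - \hat\beta \bar{x} ,  \hat\sigma_x^2 = \frac{S_{xx}}{n-1} ,  and  \hat\sigma_y^2 = \frac{S_{yy} - \hat\beta^2 S_{xx}}{n-2} ,

then

    E8 = \frac{\frac{S_{xx}}{n-1} \frac{S_{xy}}{S_{xx}} (y_0 - \bar{y} + \hat\beta \bar{x}) + \bar{x} \frac{S_{yy} - \hat\beta^2 S_{xx}}{n-2}}{\hat\beta^2 \frac{S_{xx}}{n-1} + \frac{S_{yy} - \hat\beta^2 S_{xx}}{n-2}}

and if n is reasonably large, so that n-1 ≈ n-2, then
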
9.3.4 The Empirical Bayes (types 2-4)

In general, these intervals do not perform well, particularly in terms of coverage. The coverage is close to 95% only for the case of small ρ² combined with large N. Otherwise the coverage is less than 95%, and in some cases (particularly for large ρ²) very much less than 95%. As we have previously noted in the simulation study of point estimators, the corresponding point estimators also perform very poorly for the same range of parameter values. Intervals of this type are not to be recommended.

9.3.5  General  Conclusions 

Of  the  six  different  intervals  studied,  that  associated  with  the 
inverse  point  estimator  is  uniformly  the  best.  It  is  by  far  the  most  robust 
to  departures  from  underlying  assumptions,  and  is  strongly  recommended  for 
use  in  construction  of  confidence  intervals  for  the  calibration  problem. 
emerged, even in cases where the interval existed: in some such cases, the lower interval end-point given by Brownlee was larger than the upper end-point. The percentage of such points is given in parentheses underneath the % existence figures in each table. Once again, the problem arises predominantly in a small ρ², small N situation. To overcome this difficulty, we interchanged the end-points when this situation arose. Having made this adjustment, the interval does indeed give a % coverage close to 95%. However, in terms of interval length, it performs very poorly relative to the other intervals, with one exception. As ρ² becomes large (ρ² = .9), the average length tends to that of the other intervals. An explanation of this behavior is provided by the fact that as ρ² becomes large (σ_y²/σ_x² → 0 in our model) the center point of the Brownlee interval, viz.

    \bar{x} + \frac{\hat\beta (y - \bar{y})}{\hat\beta^2 - t_{0.025}^2 s^2 / \sum (x_i - \bar{x})^2} ,

which in general (if we consider the average lengths of the 95% confidence interval) is not a very good estimator, tends to

    \bar{x} + \frac{y - \bar{y}}{\hat\beta} ,

i.e., the classical estimator, E2. We have already seen that there is reason to expect this estimator to be good for large ρ².
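
The two failure modes described for the classical interval (no real interval, and reversed end-points) can be reproduced with a short sketch. The text gives only the center point of Brownlee's interval; the full form coded below is the textbook Fieller-type interval for classical calibration, whose center agrees with that expression, and it is an assumption here that this is the interval the study used. Function and variable names are hypothetical.

```python
import math


def classical_interval(xs, ys, y0, t_crit):
    """Fieller-type classical calibration interval (assumed form).

    Returns None when no real interval exists, otherwise
    (lower, upper, was_reversed), with reversed end-points interchanged
    as described in the text.
    """
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    beta = sxy / sxx                        # slope of Y regressed on X
    s2 = (syy - beta * sxy) / (n - 2)       # residual variance of Y given X
    c = beta ** 2 - t_crit ** 2 * s2 / sxx  # denominator of the center point
    u = y0 - ybar
    radicand = u ** 2 / sxx + (1.0 + 1.0 / n) * c
    if c == 0.0 or radicand < 0.0:
        return None                         # no real interval exists
    centre = xbar + beta * u / c
    half = t_crit * math.sqrt(s2 * radicand) / c  # negative when c < 0
    lower, upper = centre - half, centre + half
    was_reversed = lower > upper
    if was_reversed:
        lower, upper = upper, lower
    return lower, upper, was_reversed


if __name__ == "__main__":
    # Hypothetical data; 3.182 is t_{0.025} with 3 d.f.
    strong = ([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1])
    weak = ([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, -1.0, 0.5, -0.5, 0.2])
    print(classical_interval(*strong, y0=6.0, t_crit=3.182))
    print(classical_interval(*weak, y0=0.04, t_crit=3.182))
```

In this form the denominator `c` is positive exactly when the fitted slope is significantly different from zero at the chosen level, which is consistent with the difficulties arising mainly for small ρ² and small N.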


9.3  Discussion  of  the  Results 

We  discuss  each  of  the  interval  types  separately. 

9.3.1 The Inverse

This interval performs extremely well, both in terms of % coverage and average interval length. Of the intervals studied it has uniformly the shortest average length for a given level of coverage. Its robustness in terms of coverage is very good. The actual coverage does not fall below 93% in any of the five distributional situations considered, and for sample sizes 40 and 80 it does not fall below 94%. For situations where X is Normal, the coverage is very close to 95%.

9.3.2 The Empirical Bayes type 1

In terms of % coverage and average length, this empirical Bayes interval has a performance profile very similar to that of the inverse. Its coverage, in general, tends to be somewhat lower than the required 95%, and the average length tends to be marginally less than that for the inverse. Its robustness is very similar to that described in relation to the inverse.

9.3.3 The Classical

This interval performs, in general, very poorly. In the first instance, we examine the situations where it fails to exist. The final column in each table gives the percentage of simulations for which this interval existed. In general, no real interval existed for a large percentage of the simulations when ρ² was small (.1) and particularly so if N was also small. The % of non-existing intervals decreased rapidly (from c. 30-35% to 4-5%) as N increased from 20 to 80. A further difficulty also




Table 9.5 (continued)

                Emp. Bayes 2             Emp. Bayes 3             Emp. Bayes 4
ρ²     N      %     Av.      %         %     Av.      %         %     Av.      %
             Cov.  Length  Exist      Cov.  Length  Exist      Cov.  Length  Exist

.1    20    86.7    3.0     99.9     88.4    3.4    100.0     88.7    3.4    100.0
      40    92.3    3.5     99.9     93.6    3.8    100.0     93.7    3.9    100.0
      80    93.6    3.5     99.9     94.2    3.6    100.0     94.2    3.6    100.0
.5    20    83.1    2.1     99.5     75.8    1.8    100.0     79.1    1.9    100.0
      40    89.3    2.2     99.8     84.9    2.0    100.0     86.9    2.0    100.0
      80    91.6    2.3     99.8     88.6    2.0    100.0     90.2    2.1    100.0
.9    20    76.8    0.9     98.3     44.2    0.6     99.1     47.8    0.6     99.2
      40    86.9    1.1     99.0     51.7    0.6     99.5     55.3    0.6     99.6
      80    90.6    1.1     99.3     55.5    0.5     99.7     59.2    0.5     99.7
TABLE 9.5

Performance of Various Confidence Intervals
X: t   Error: t

Confidence Interval

                  Inverse        Emp. Bayes 1            Classical
ρ²     N       %       Av.       %       Av.       %       Av.      %
              Cov.   Length     Cov.   Length     Cov.   Length   Exist

.1    20     93.2     3.6      91.5     3.3      94.3    40.1     66.2
                                                                 (16.1)
      40     94.0     3.5      93.1     3.4      95.4   141.0     79.0
                                                                  (9.1)
      80     94.4     3.5      94.0     3.5      95.8    27.2     93.5
                                                                  (2.5)
.5    20     92.7     2.6      90.9     2.4      94.0     7.0     96.6
                                                                  (1.2)
      40     94.0     2.6      92.7     2.5      94.5     4.5     99.2
                                                                  (0.1)
      80     94.0     2.6      93.6     2.5      94.7     4.0     99.7
                                                                  (0.0)
.9    20     93.1     1.2      92.7     1.1      93.5     1.3     99.9
                                                                  (0.0)
      40     94.0     1.1      93.2     1.1      94.1     1.3     99.9
                                                                  (0.0)
      80     94.4     1.1      94.1     1.1      94.6     1.3    100.0
                                                                  (0.0)
Table 9.4 (continued)

                Emp. Bayes 2             Emp. Bayes 3             Emp. Bayes 4
ρ²     N      %     Av.      %         %     Av.      %         %     Av.      %
             Cov.  Length  Exist      Cov.  Length  Exist      Cov.  Length  Exist

.1    20    88.5    3.3    100.0     90.3    3.7    100.0     90.7    3.8    100.0
      40    91.4    3.3    100.0     92.5    3.7    100.0     92.6    3.9    100.0
      80    93.6    3.5    100.0     94.3    3.6    100.0     94.2    3.6    100.0
.5    20    83.9    2.2     99.8     78.4    2.0    100.0     81.4    2.2    100.0
      40    90.4    2.4     99.9     86.9    2.2    100.0     89.5    2.4    100.0
      80    92.0    2.5     99.9     88.6    2.3    100.0     91.4    2.5    100.0
.9    20    78.8    1.0     98.4     43.8    0.6     99.0     51.4    0.6     99.2
      40    86.0    1.1     99.0     45.3    0.6     99.4     56.2    0.7     99.5
      80    90.7    1.2     99.6     47.1    0.5     99.8     60.6    0.7     99.2


TABLE  9.4 


Performance  of  Various  Confidence  Intervals 
X:  t  Error:  Normal 

Confidence  Interval 
Table 9.3 (continued)

                Emp. Bayes 2             Emp. Bayes 3             Emp. Bayes 4
ρ²     N      %     Av.      %         %     Av.      %         %     Av.      %
             Cov.  Length  Exist      Cov.  Length  Exist      Cov.  Length  Exist

.1    20    85.5    3.1    100.0     87.5    3.3    100.0     87.7    3.4    100.0
      40    90.7    3.4    100.0     92.6    3.7    100.0     93.8    3.7    100.0
      80    93.3    3.6    100.0     94.3    3.9    100.0     94.3    3.9    100.0
.5    20    82.3    2.3     99.9     76.9    2.1     99.9     79.4    2.2    100.0
      40    88.0    2.4    100.0     83.4    2.1    100.0     90.7    2.5    100.0
      80    91.9    2.6    100.0     89.1    2.3    100.0     90.7    2.5    100.0
.9    20    77.3    1.1     98.9     48.1    0.6     99.8     50.8    0.6     98.9
      40    86.5    1.2     99.5     51.4    0.5     99.2     55.9    0.6     99.4
      80    90.0    1.2     99.9     53.9    0.4     99.8     58.9    0.6     99.8